Regression for Linguists

Resources and Set-up

Author: Daniela Palleschi

Affiliation: Humboldt-Universität zu Berlin

Published: October 4, 2023

Resources

This course is mainly based on Winter (2019), which is an excellent introduction to regression for linguists. For even more introductory tutorials, I recommend going through Winter (2013) and Winter (2014). For a more intermediate textbook, I’d recommend Sonderegger (2023).

If you’re interested in the foundational writings on linear mixed models in (psycho)linguistic research, I’d recommend reading Baayen (2008), Jaeger (2008), Barr et al. (2013), and Matuschek et al. (2017).

Assumptions about you

For this course, I assume that you are familiar with more classical statistical tests, such as the t-test, Chi-square test, etc. I also assume you are familiar with measures of central tendency (mean, median, mode), measures of dispersion/spread (standard deviation), and the concept of a normal distribution. Lacking this knowledge will not impede your progress in the course, but it is an important foundation on which we’ll be building. We can review these concepts in class as needed.
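
As a quick refresher, the measures named above can be computed in R as follows. This is a minimal sketch on invented toy data (the vector `x` and the helper name `stat_mode` are assumptions made for illustration, not course material):

```r
# Toy data, invented for illustration only
x <- c(2, 4, 4, 5, 10)

mean(x)    # arithmetic mean: 5
median(x)  # middle value of the sorted data: 4
sd(x)      # sample standard deviation: 3

# Base R has no function for the statistical mode (mode() returns the
# storage type instead), so we count each value and pick the most frequent
counts <- table(x)
stat_mode <- as.numeric(names(counts)[which.max(counts)])
stat_mode  # 4
```

Note that `sd()` divides by n - 1 (the sample standard deviation), which is why the result here is exactly 3.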

Software

  • R: a statistical programming language (the underlying language)

  • RStudio: a program that facilitates working with R; our preferred integrated development environment (IDE)

  • LaTeX: a typesetting system that generates documents in PDF format

  • Why R?

    • R and RStudio are open-source and free software
    • they are widely used in science and business

Install R

  • we need the free and open source statistical software R to analyze our data
  • download and install R: https://www.r-project.org
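
Once R is installed, one way to confirm that it runs is to ask it for its own version in the console. This is a small sketch; the version threshold `"4.0.0"` is an illustrative assumption, not a course requirement:

```r
# Print the version string of the running R installation
R.version.string

# getRversion() returns a version object that can be compared to a string
r_version <- getRversion()
r_version >= "4.0.0"  # assumed threshold, for illustration only
```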

Install RStudio

  • we need RStudio to work with R more easily
  • Download and install RStudio: https://rstudio.com
  • it can be helpful to keep English as the language in RStudio
    • we will find more helpful information if we search for error messages in English on the internet
  • If you have problems installing R or RStudio, check out this help page (in German): http://methods-berlin.com/wp-content/uploads/Installation.html
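
If R itself prints its messages in another language, the `LANGUAGE` environment variable can usually switch them to English for the current session. This is a hedged sketch: whether it takes effect depends on how R was built on your system.

```r
# Ask R to produce its messages in English for the current session
Sys.setenv(LANGUAGE = "en")

# Confirm that the variable is set
Sys.getenv("LANGUAGE")
```

The setting is lost when R is restarted, so it has to be run again in each new session.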

Install LaTeX

  • we will not work with LaTeX directly, but it is needed in the background
  • Download and install LaTeX: https://www.latex-project.org/get/
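
To check from within R whether a LaTeX installation is visible, you can look for one of its programs on the search path. This sketch assumes the engine is called `pdflatex`; other setups may use a different engine name:

```r
# Sys.which() returns the full path of a program, or "" if it is not found
latex_path <- Sys.which("pdflatex")  # "pdflatex" is an assumed engine name

if (nzchar(latex_path)) {
  message("LaTeX found at: ", latex_path)
} else {
  message("No pdflatex on the search path; LaTeX may not be installed yet.")
}
```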

Further resources

  • many aspects of this course are inspired by Nordmann and DeBruine (2022) and Wickham et al. (n.d.)
    • both freely available online (in English)
  • for German-language resources, visit the website of Methodengruppe Berlin

Troubleshooting

  • Error messages are very common in programming, at all levels.
  • Finding solutions to these error messages is an art in itself
  • Google is your friend! If possible, google in English to get more information
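
When searching for an error message, it helps to have the exact text at hand. The sketch below captures the message of a deliberately provoked error (the object name `undefined_object` is hypothetical) and prints version details that are useful when asking for help:

```r
# Run a piece of code and keep the error message instead of stopping
result <- tryCatch(
  mean(undefined_object),  # hypothetical mistake: this object does not exist
  error = function(e) conditionMessage(e)
)
result  # the text to paste into a search engine

# Version details of R, the operating system, and attached packages
sessionInfo()
```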

References

Baayen, R. H. (2008). Analyzing Linguistic Data: A Practical Introduction to Statistics using R.
Baayen, R. H., & Shafaei-Bajestan, E. (2019). languageR: Analyzing linguistic data: A practical introduction to statistics. https://CRAN.R-project.org/package=languageR
Lüdecke, D., Ben-Shachar, M. S., Patil, I., Waggoner, P., & Makowski, D. (2021). performance: An R package for assessment, comparison and testing of statistical models. Journal of Open Source Software, 6(60), 3139. https://doi.org/10.21105/joss.03139
Winter, B. (2013). Linear models and linear mixed effects models in R: Tutorial 1.
Winter, B. (2014). A very basic tutorial for performing linear mixed effects analyses (Tutorial 2).
Winter, B. (2019). Statistics for Linguists: An Introduction Using R. Routledge. https://doi.org/10.4324/9781315165547
Source code
---
author: "Daniela Palleschi"
institute: Humboldt-Universität zu Berlin
# footer: "Lecture 1.1 - R und RStudio"
lang: en
date: "`r Sys.Date()`"
format:
  html:
    output-file: kursuebersicht_blatt.html
    number-sections: true
    number-depth: 3
    toc: true
    code-overflow: wrap
    code-tools: true
    self-contained: true
    fig-width: 6
  pdf:
    output-file: course_overview.pdf
    toc: true
    number-sections: true
    colorlinks: true
    fig-width: 4
    code-overflow: wrap
bibliography: references.bib
csl: apa.csl
execute: 
  eval: true # evaluate chunks
  echo: true # 'print code chunk?'
  message: false # 'print messages (e.g., warnings)?'
  error: true # ignore errors when rendering?
  warning: false
---

# Resources and Set-up {.unnumbered}

```{r, eval = T, cache = F}
#| echo: false
# Update the bibliography file with the citations used in this script
# make sure the file matches 'bibliography: references.bib' in the YAML
rbbt::bbt_update_bib("00-course_overview.qmd")
```

# Resources

This course is mainly based on @winter_statistics_2019, which is an excellent introduction to regression for linguists. For even more introductory tutorials, I recommend going through @Winter_2013 and @Winter_2014. For a more intermediate textbook, I'd recommend @sondregger_regression_2023.

If you're interested in the foundational writings on the topic of linear mixed models in (psycho)linguistic research, I'd recommend reading @baayen_2008; @jaeger_2008; @barr_2013; @matschucek_2017.
    
# Assumptions about you

For this course, I assume that you are familiar with more classical statistical tests, such as the t-test, Chi-square test, etc. I also assume you are familiar with measures of central tendency (mean, median, mode), measures of dispersion/spread (standard deviation), and the concept of a normal distribution. Lacking this knowledge will not impede your progress in the course, but it is an important foundation on which we'll be building. We can review these concepts in class as needed.

# Software {#sec-software}

- R: a statistical programming language (the underlying language)
- RStudio: a program that facilitates working with R; our preferred integrated development environment (IDE)
- LaTeX: a typesetting system that generates documents in PDF format

- Why R?
  -  R and RStudio are open-source and free software
  -  they are widely used in science and business

::: {.content-hidden when-format="pdf"}
::: {.column width="30%"}
```{r eval = F, fig.env = "figure", out.width="50%", fig.align = "center"}
#| echo: false

magick::image_read(here::here("media/R_logo.png"))
```
:::

::: {.column width="30%"}
```{r eval =F , fig.env = "figure", out.width="75%", fig.align = "center"}
#| echo: false

magick::image_read(here::here("./media/RStudio_logo.png"))
```
:::
:::

```{r eval = F, fig.env = "figure", out.width="75%", fig.align = "center"}
#| echo: false

magick::image_read(here::here("./media/LaTeX_logo.png"))
```


::: {.content-visible when-format="pdf"}
```{r eval = F, fig.env = "figure", fig.pos="H", out.width="75%", fig.align = "center"}
#| echo: false

R <- grid::rasterGrob(as.raster(png::readPNG(here::here("./media", "R_logo.png"))))

RStudio <- grid::rasterGrob(as.raster(png::readPNG(here::here("./media", "RStudio_logo.png"))))

latex <- grid::rasterGrob(as.raster(png::readPNG(here::here("./media", "LaTeX_logo2.png"))))

gridExtra::grid.arrange(R, NULL, RStudio, NULL, latex, ncol=5,
                        widths=c(.25,.125,.25,.125,.25))
```
:::

## Install R

- we need the free and open source statistical software R to analyze our data
- download and install R: <https://www.r-project.org>

## Install RStudio

- we need RStudio to work with R more easily
- Download and install RStudio: <https://rstudio.com>
- it can be helpful to keep English as the language in RStudio
    - we will find more helpful information if we search for error messages in English on the internet

- If you have problems installing R or RStudio, check out this help page (in German): <http://methods-berlin.com/wp-content/uploads/Installation.html>

## Install LaTeX

- we will not work with LaTeX directly, but it is needed in the background
- Download and install LaTeX: <https://www.latex-project.org/get/>

# Further resources

- many aspects of this course are inspired by @nordmann_applied_2022 and @wickham_r_nodate
    - both freely available online (in English)
- for German-language resources, visit the website of [Methodengruppe Berlin](http://methods-berlin.com/de/r-lernplattform/)

## Troubleshooting

- Error messages are very common in programming, at all levels.
- Finding solutions to these error messages is an art in itself
- Google is your friend! If possible, google in English to get more information

# References {.unlisted .unnumbered visibility="uncounted"}

::: {#refs custom-style="Bibliography"}
:::